Removing phase mismatches in concatenative speech synthesis
Author
Abstract
Concatenation of acoustic units is widely used in most of the currently available text-to-speech systems. While this approach leads to higher intelligibility and naturalness than synthesis-by-rule, it has to cope with the issues of concatenating acoustic units that have been recorded in a different order. One important issue in concatenation is that of synchronization of speech frames or, in other words, inter-frame coherence. This paper presents a novel method for synchronization of signals with applications to speech synthesis. The method is based on the notion of center of gravity applied to speech signals. It is an off-line approach as this can be done during analysis with no computational burden on synthesis. The method has been tested with the Harmonic plus Noise Model, HNM, on many large speech databases. The resulting synthetic speech is free of phase mismatch (inter-frame incoherence) problems.
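The abstract names the center of gravity as the basis for synchronization but does not give its formula. As a rough illustration only, the sketch below assumes an energy-weighted mean sample index as the center of gravity and aligns each frame by circularly shifting it so that this point sits at the middle of the frame. The function names `center_of_gravity` and `align_to_center` are hypothetical, and this should not be read as the paper's actual procedure.

```python
import numpy as np


def center_of_gravity(frame):
    """Energy-weighted mean sample index of a frame.

    Assumption: the abstract does not define the center of gravity,
    so an energy-weighted centroid is used here for illustration.
    """
    frame = np.asarray(frame, dtype=float)
    energy = frame ** 2
    total = energy.sum()
    if total == 0.0:
        # A silent frame has no preferred position; use the midpoint.
        return (len(frame) - 1) / 2.0
    return float(np.dot(np.arange(len(frame)), energy) / total)


def align_to_center(frame):
    """Circularly shift a frame so its center of gravity lies near the middle.

    Returns the shifted frame and the integer shift that was applied.
    A circular shift is a simplification that suits frames whose energy
    is concentrated well inside the frame boundaries.
    """
    frame = np.asarray(frame, dtype=float)
    target = (len(frame) - 1) / 2.0
    shift = int(round(target - center_of_gravity(frame)))
    return np.roll(frame, shift), shift


# Two frames containing the same pulse at different positions,
# standing in for units recorded at different times.
n = np.arange(201)
frame_a = np.exp(-0.5 * ((n - 60) / 5.0) ** 2)
frame_b = np.exp(-0.5 * ((n - 140) / 5.0) ** 2)

aligned_a, shift_a = align_to_center(frame_a)
aligned_b, shift_b = align_to_center(frame_b)
```

Because each frame is shifted independently toward the same reference point, the shifts could be computed once during analysis and stored, which is consistent with the off-line character described in the abstract.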
Similar resources
Synchronization of speech frames based on phase data with application to concatenative speech synthesis
Synchronization of speech frames is an important issue in a concatenative speech synthesis system. In signal processing terms, this translates into removing linear phase mismatches between concatenated speech frames. This paper presents two novel approaches to the problem of synchronization of speech frames with an application to concatenative speech synthesis. Both methods are based on a pr...
Removing linear phase mismatches in concatenative speech synthesis
Many current text-to-speech (TTS) systems are based on the concatenation of acoustic units of recorded speech. While this approach is believed to lead to higher intelligibility and naturalness than synthesis-by-rule, it has to cope with the issues of concatenating acoustic units that have been recorded at different times and in a different order. One important issue related to the concatenation...
On the Detection of Discontinuities in Concatenative Speech Synthesis
Over the last decade, considerable work has been done on finding an objective distance measure that can predict audible discontinuities in concatenative speech synthesis. Speech segments in concatenative synthesis are extracted from disjoint phonetic contexts, so discontinuities in spectral shape and phase mismatches tend to occur at unit boundaries. Many feature sets, most of them of spectral na...
Efficient Speech Synthesis System using the Deterministic plus Stochastic Model
This paper describes a high-quality concatenative synthesis system that uses the deterministic plus stochastic model of speech, in which prosodic modifications are performed by means of very simple and efficient operations, as we reported in a previous work [11]. In particular, pitch synchrony is not necessary, and linear interpolation replaces other types of estimation. The method for...
Stages and method of preparing syllable and diphone speech databases for a Persian text-to-speech system
Speech databases are part of concatenative text-to-speech synthesis systems. The phonetic quality of the databases plays a significant role in the naturalness of the synthesized speech. This paper introduces a syllable speech database and a diphone speech database for Persian and examines how they were developed, their specifications, and their relative advantages. ...